File sharing, typically involving video or audio material in which
copyright may persist and using peer-to-peer (P2P) networks like
BitTorrent, has been reported to make up the bulk of Internet traffic
(Pouwelse et al., 2008; Kryczka et al., 2011). The free-riding problem
appears in this "digital gift economy" but its users exhibit rational
behaviour (Becker and Clement, 2006), subject to the characteristics of
the particular network (Feldman et al., 2006). The high demand for the
Internet as a delivery channel for entertainment (Alleman and
Rappoport, 2009) underlines the importance of understanding the
dynamics of this market, especially when considering possible business
models for future pricing or licensing regimes (Gervais, 2004) and for
the provisioning of network capacity to support future services. The
availability of specific titles on file sharing networks is the focus
of this paper, with a special emphasis on the P2P protocol BitTorrent.
The paper compares the incentives provided in BitTorrent to those in
other file-sharing communities, including file hosting, and discusses
the number of titles available in the community at any given time, with
an emphasis on popular video items with ambiguous legal status (Watters
et al., 2011).



Introduction 

The main object of this research is to understand 
how incentives operate in the world of (mainly 
anonymous) file sharing, to produce a spectrum 
of available content. The mechanism for sharing 
will clearly influence the amount of content avail- 
able to any specific individual seeking such content 
online. After all, the proliferation of the technol- 
ogy for home audio and video taping in the 1970s 
and 1980s created an environment in which any 
individual had easy access to all material under 
the control of their conventional social peers but 
not more. The legal controversy (Lessig, 2004) of
then and now is, however, not the main topic of this
work.

The Internet, together with digital multimedia formats, has enabled
media sharing (like many other things) to take place between complete
strangers all over the globe and with lossless transmission of the
content. The use of file-sharing networks for distributing unauthorised
copies of copyrighted material is of concern to the media and
publications industries (Holsapple et al., 2008) and one reaction has
been attempts by authorities in many countries to either block
file-sharing traffic or to ban access to websites indexing such
content, recently in Italy and reportedly even more stringently in the
United Kingdom (Arthur, 2010). Given the high percentage of Internet
traffic involving file sharing (Gummadi et al.; Menasche et al., 2009;
Plissonneau et al.; Pouwelse et al., 2008) this is of particular
concern also in the debate around network neutrality.

The high demand for the Internet as a delivery channel for audio-visual
entertainment (Alleman and Rappoport, 2009) underlines the importance
of understanding the dynamics of this market, especially when
considering possible business models for future pricing or licensing
regimes (Gervais, 2004) and for the provisioning of network capacity to
support future services. This is of particular interest when it becomes
a matter of public policy, for instance where tax-funded investment in
next-generation networks (NGNs) is being considered.

The availability of specific titles on file sharing networks is the
focus of this paper, with a special emphasis on the peer-to-peer (P2P)
protocol BitTorrent. An overview of the operation of the network will
be given and we shall consider the model proposed by Menasche et al.
(2009) for content availability in BitTorrent swarms. Alternatives to
P2P are briefly considered and finally some factors determining the
availability of titles are discussed, together with data illustrating
users' behaviour on the network.

Altruism in file-sharing 

File sharing communities on the Internet are 
characterised by a high degree of anonymity, es- 
pecially in light of the (somewhat remote) possi- 
bility of prosecution or (more likely) threatening 
letters from Internet service providers (ISPs). In- 
deed, these communities are responding to attacks 
from copyright holders by evolving more sophisti- 
cated mechanisms for maintaining anonymity. Un- 
like the taping and mix-taping of earlier decades, 
where the sharing communities presumably coin- 
cided with ordinary social networks, file sharing 
requires a degree of altruism that is not backed 
up by an implicit offline social quid-pro-quo convention.
Nevertheless, a lot of file sharing between strangers evidently does
take place. The role of altruism in these networks has been studied by
many people, e.g. Feldman et al. (2006). Basically, small costs can be
imposed for free-riding. Furthermore, there are closed networks where
this is not a problem and users exhibit rational behaviour within this
digital gift economy (Becker and Clement, 2006) subject to the
characteristics of the particular network (Feldman et al., 2006).



The principle of P2P networking is illustrated by the following toy
example. Suppose that a single publisher¹ of some specific content
appears on the network where a single potential recipient is waiting.
If the recipient (or leecher) is not prepared to donate anything to
others on the network, and the publisher (or seeder) is prepared to
donate only one copy, the net effect will be that the leecher downloads
a single copy and the publisher donates one copy. Further leechers
arriving on the network and seeking a copy will not be serviced.
However, should leechers be prepared to act as peers, i.e. to let
others download from them, then a large number of copies can be
distributed from a single seeder. Suppose $n$ leechers appear
simultaneously, while the original seeder is available, and each
leecher is prepared to donate only the equivalent of $(n-1)/n$ copies
of the file. Then it becomes possible for each of the leechers to
obtain the entire file in the following way.

1. Let the $k$-th leecher download the $k$-th part of the file, of size
$1/n$ of the whole, so that the seeder will have donated only a total
of one full copy of the file.

2. Now, let each leecher donate his/her fraction $1/n$ of the desired
file to each of the $n - 1$ other leechers.


At this point each leecher will have obtained a full 
copy of the file but will have uploaded less than a 
full copy of the content. 
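
The two steps above can be checked with a small simulation. The following is only a sketch under simplifying assumptions that are not part of any real protocol: the file is cut into exactly $n$ equal parts, all transfers succeed, and the function name `simulate_toy_swarm` is hypothetical.

```python
from fractions import Fraction


def simulate_toy_swarm(n):
    """Simulate the two-step exchange for n leechers and one seeder.

    Sizes are expressed as fractions of the whole file.
    Returns (seeder_upload, leecher_uploads, everyone_complete).
    """
    part = Fraction(1, n)

    # Step 1: leecher k downloads part k from the seeder.
    holdings = [{k} for k in range(n)]
    seeder_upload = part * n  # n parts of size 1/n: one full copy in total

    # Step 2: each leecher sends its own part to each of the n - 1 others.
    leecher_uploads = [Fraction(0)] * n
    for k in range(n):
        for j in range(n):
            if j != k:
                holdings[j].add(k)
                leecher_uploads[k] += part

    everyone_complete = all(h == set(range(n)) for h in holdings)
    return seeder_upload, leecher_uploads, everyone_complete


seeder, uploads, complete = simulate_toy_swarm(4)
print(seeder, uploads[0], complete)  # prints: 1 3/4 True
```

With four leechers the seeder uploads one full copy, each leecher uploads three quarters of a copy, and every leecher ends up with the whole file.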

It is easy to see how some free-riding can be accommodated within the
system. Suppose there were, as above, $n$ peers and one seed but also a
single leecher not willing or able to upload any content. If each of
the $n$ original peers is prepared to donate a full copy of the
content, only step 2 above need be modified to enable the free-riding
leecher to also obtain a full copy of the content. Obviously many more
free-riding leechers could be serviced, should sufficiently many peers
be prepared to make available even more than a single multiple of the
content. Equally obviously, the ability of leechers to download desired
content is entirely dependent on the willingness of peers to make the
content available to others.

1 Here publisher signifies any entity or individual making a file
available to others, and not necessarily the copyright owner or offline
publisher.

The toy example is simplistic and does not even 
incorporate the influence of accidental sharing by 
leechers. In the case of very popular content, most 
P2P clients will automatically share chunks of con- 
tent already downloaded while not yet in posses- 
sion of the complete file and will continue seeding 
the content once the download has been completed 
until either 

• a prescribed share ratio (number of full copy 
equivalents shared to others) has been reached, or 

• the client or torrent is stopped manually by the 
user. 
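
The stopping rule described above can be written down in a few lines. This is a hedged sketch rather than the logic of any particular client: the class `TorrentState`, the function `keeps_sharing` and the parameter `target_ratio` are hypothetical names, and real clients may apply further conditions.

```python
from dataclasses import dataclass


@dataclass
class TorrentState:
    file_size: int                 # bytes in the complete file
    uploaded: int                  # bytes uploaded to other peers so far
    download_complete: bool
    stopped_by_user: bool = False


def share_ratio(state):
    """Number of full-copy equivalents uploaded to others."""
    return state.uploaded / state.file_size


def keeps_sharing(state, target_ratio):
    """Return True while the client would still upload to other peers."""
    if state.stopped_by_user:
        return False
    if not state.download_complete:
        # Chunks already downloaded are shared while the download runs.
        return True
    # After completion, seed until the prescribed share ratio is reached.
    return share_ratio(state) < target_ratio


state = TorrentState(file_size=1000, uploaded=400, download_complete=True)
print(keeps_sharing(state, target_ratio=1.0))
```

Under these assumptions a leecher that never stops its client manually contributes up to `target_ratio` copies without any deliberate decision to share.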

In the case of popular content just after publication, many leechers
will unintentionally act in a very altruistic way since they will,
while waiting for a download to complete, facilitate many uploads of
the same material to others, all the more so because demand is high at
these times. Such unintentional peering probably contributes
significantly to the availability of titles shortly after release, for
example popular television shows in the days after they have been
broadcast or otherwise leaked to the public. Even though the preceding
illustrates that free-riding can be accommodated in a file sharing
community, the presence of a community of BitTorrent publishers with a
financial incentive has been hypothesized (Cuevas et al., 2010).

Consider the (perhaps rather unscientific) statistics published by the
website KickAssTorrents.com in Table 1, harvested from the BitTorrent
tracker PublicBT.com for episodes from Season 6 of the series Desperate
Housewives published through the highly regarded and prolific EZTV.
Availability of Episode 1 seems good even after several months although
the amount of leeching going on at a specific time is not very high at
all, the number of downloaders being just over 1% of the number
observed less than 48 hours after episode 20 of the show was shown on
ABC broadcast television in the United States. Clearly the amount of
free-riding or miserly sharing is much higher early in the life of the
swarm.²

Table 1

Availability on PublicBT.com for two selected episodes of Desperate
Housewives

File name       Date            Seeders   Leechers
...S06E20...    27 April 2010    10 113      3 863
...S06E01...    27 April 2010       334         48
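
The comparison drawn from Table 1 can be reproduced directly from the four counts. The sketch below uses only the figures in the table; treating the number of seeders per leecher as a rough indicator of how generously a swarm is served is an assumption made here for illustration.

```python
# Seeder and leecher counts copied from Table 1.
table1 = {
    "S06E20": {"seeders": 10113, "leechers": 3863},
    "S06E01": {"seeders": 334, "leechers": 48},
}

# Leechers on the old episode as a share of leechers on the new one.
leecher_share = table1["S06E01"]["leechers"] / table1["S06E20"]["leechers"]

# Seeders available per active leecher in each swarm.
seeders_per_leecher = {
    episode: counts["seeders"] / counts["leechers"]
    for episode, counts in table1.items()
}

print(f"{leecher_share:.2%}")  # prints: 1.24%
for episode, ratio in seeders_per_leecher.items():
    print(episode, round(ratio, 2))
```

The older swarm has roughly seven seeders per leecher against fewer than three in the new one, which is consistent with sharing being comparatively scarcer early in the life of a swarm.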

The BitTorrent network 

The BitTorrent protocol³ enjoys widespread use and has been implemented
on many platforms. It is also the primary protocol supported by the
famous PirateBay portal for online content and the subject of frequent
legal action. However, BitTorrent (BT) is also used for distributing
software such as installation discs for new Linux distributions and for
scientific data (Langille and Eisen, 2010). The main reason for the
success of BT is that it scales very well when demand increases. As
outlined cursorily above, a P2P distribution system can easily
accommodate a very high demand for a certain large file or collection
of files without a specifically high degree of investment at any
specific node (Izal et al., 2004). In fact, free-riding appears to be
the only problem other than unavailability of content. It will be
helpful for the discussion below to describe briefly the operation of a
BT network, from the user's point of view.

A prospective user of general content will typically visit a torrent
index site such as PirateBay⁴ or AnimeSuki⁵ where the potential
downloader will click on a file with the .torrent file extension (or,
more recently, a magnet link). This torrent file will contain
references to specific torrent tracking servers and might well be
opened automatically by BT client software⁶ on the user's computer so
that the user will not necessarily even be aware of the torrent file
per se. The BT client will start querying the trackers listed in the
torrent file as part of the process of joining a swarm. The tracker
servers will provide information about actual peers already in the
swarm and the new peer will start requesting, and eventually offering,
parts of the file to be downloaded. The download time experienced by
the user depends very much on the conditions in the swarm (Chiu and
Eun, 2008).

2 A single title can be available in many swarms (Vinko et al., 2012).
The statistics in Table 1 are for a single swarm.

3 Invented by Bram Cohen.

4 http://thepiratebay.org/

5 http://www.animesuki.com/



The BT swarm as M/G/∞ queue



Menasche et al. (2009) describe a BT swarm as an M/G/∞ queue (Browne
and Steele, 1993). That is, the swarm is supposed to consist of
publishers which appear at irregular intervals and peers with Poisson
arrivals. The queuing system is busy as long as the title remains
available from peers that are online. They deduce estimates for the
expected length of a busy period under various assumptions w.r.t. peer
behaviour. For example, if all peers are selfish and leave as soon as
their download has been completed, the expected length of a busy period
is

$$B = \frac{e^{s(r+\lambda)/\mu} - 1}{r + \lambda}$$


where $s$ is the size of the file, $\mu$ is the mean download rate of
peers, $r$ the arrival rate of publishers and $\lambda$ the arrival
rate of peers. It is further assumed that peers who arrive when the
content is not available immediately leave. Publishers are assumed to
stay online only for a time $s/\mu$, i.e. long enough to serve one copy
of the content. The reader will be able to confirm, at a glance, that
$B$ behaves as common sense would suggest when $\mu \to \infty$ and
$\lambda \to \infty$. One can also observe that doubling the size of
the file from $s$ to $2s$ would increase the size of $B$ by a factor of

$$\frac{e^{2s(r+\lambda)/\mu} - 1}{e^{s(r+\lambda)/\mu} - 1} = e^{s(r+\lambda)/\mu} + 1,$$

which is very substantial. The busy periods will generally be much
longer in a swarm with file size $2s$ than the sum of the expected busy
times of two separate swarms with file size $s$. Menasche et al. (2009)
obtain much more precise mathematical estimates of the advantage in
terms of availability in large swarms and could thereby present a
convincing explanation of content bundling.⁷
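
The busy-period expression and the effect of doubling the file size can be explored numerically. The sketch below assumes the selfish-peer formula quoted in this section; the function names and all parameter values are illustrative assumptions and are not taken from Menasche et al. (2009).

```python
import math


def expected_busy_period(s, mu, r, lam):
    """Expected busy period B = (exp(s * (r + lam) / mu) - 1) / (r + lam).

    s: file size, mu: mean download rate of peers,
    r: arrival rate of publishers, lam: arrival rate of peers.
    """
    rate = r + lam
    # expm1(x) computes exp(x) - 1 accurately even for small x.
    return math.expm1(s * rate / mu) / rate


def doubling_factor(s, mu, r, lam):
    """Factor by which B grows when the file size goes from s to 2s."""
    return expected_busy_period(2 * s, mu, r, lam) / expected_busy_period(s, mu, r, lam)


# Illustrative values only: consistent but arbitrary units.
s, mu, r, lam = 1.0, 1.0, 0.5, 0.5
print(expected_busy_period(s, mu, r, lam))
print(doubling_factor(s, mu, r, lam))
```

Because the doubling factor equals the exponential term plus one, it is always greater than two, so a bundle of two files stays available for longer than two separate swarms of single files taken together.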

Predictions by the model of Menasche et al. (2009) are consistent with
a vast amount of data on real networks investigated in their study. The
data in their work showed 80% of swarms as unavailable at least 80% of
the time. Nevertheless, their model might be inappropriate for the
large amount of BT activity that consists of the distribution of
material within the first few weeks (or days) after it appears. If this
initial burst of activity is taken into account, the arrival rate of
publishers and peers is certainly far from constant, as one can see in
Table 2, where activity peaked about 36 hours after the swarm was
constituted and then declined rather rapidly. The M/G/∞ queue also has
no room for incorporating individual behaviour or system-wide
constraints such as the total amount of available storage and bandwidth
and possibly limited demand for audio-visual content.

A fluid model 



Qiu and Srikant (2004) use differential equations to describe a fluid
model for BT swarms. As in the model of Menasche et al. (2009),
parameters like the peer arrival rate are assumed to be constant. They
prove the existence of a Nash equilibrium under certain conditions and
have experimental data that appears to validate their model under the
given assumptions.

6 For example, Vuze or BitTornado.

7 Content bundling is the phenomenon where individual books or episodes
of a television series are not available as single files but only in a
bundled torrent consisting, for example, of hundreds of e-books on a
particular topic, or of all of a specific season of a television
series.

Date and time              Seeding   Leeching
27 April 2010, 15:32:30     10 113      3 863
27 April 2010, 19:15:01     11 187      5 132
27 April 2010, 21:19:10     11 000      4 872
27 April 2010, 22:31:42      9 664      4 468
28 April 2010, 22:16:10      7 701      2 445
29 April 2010, 05:20:27      4 640      1 078
29 April 2010, 21:32:32      6 825      1 887
30 April 2010, 07:35:36      4 416        840

Table 2

Torrent activity for Desperate Housewives S06E20 during the week after
its first broadcast, from kickasstorrents.com, reported for
tracker.publicbt.com and a swarm formed on 26 April 2010 with purposive
sampling.
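
The samples in Table 2 can be examined programmatically to locate the peak and to gauge the subsequent decline. The sketch below uses only the counts from the table; measuring activity as the sum of seeders and leechers is an assumption made for illustration.

```python
from datetime import datetime

# (sample time, seeders, leechers) copied from Table 2.
samples = [
    (datetime(2010, 4, 27, 15, 32, 30), 10113, 3863),
    (datetime(2010, 4, 27, 19, 15, 1), 11187, 5132),
    (datetime(2010, 4, 27, 21, 19, 10), 11000, 4872),
    (datetime(2010, 4, 27, 22, 31, 42), 9664, 4468),
    (datetime(2010, 4, 28, 22, 16, 10), 7701, 2445),
    (datetime(2010, 4, 29, 5, 20, 27), 4640, 1078),
    (datetime(2010, 4, 29, 21, 32, 32), 6825, 1887),
    (datetime(2010, 4, 30, 7, 35, 36), 4416, 840),
]


def swarm_size(sample):
    """Total number of peers (seeders plus leechers) in one sample."""
    _, seeders, leechers = sample
    return seeders + leechers


peak = max(samples, key=swarm_size)
last = samples[-1]

print("peak sample:", peak[0], "with", swarm_size(peak), "peers")
print("final sample as share of peak:", round(swarm_size(last) / swarm_size(peak), 2))
```

The largest sampled swarm occurs on the evening of 27 April, and by the last sample the swarm has shrunk to about a third of that size, although the evening sample of 29 April is larger than the morning one, which hints at a daily cycle that a constant arrival rate cannot capture.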

Anonymous file hosting 

Anonymous file hosting is another way of sharing content when the
publisher does not necessarily want to do so openly, possibly out of
fear of political persecution, harassment or prosecution for possible
copyright violations. An anonymous file host allows users to upload a
file to an Internet web page with a generic name⁸ that gives no
indication of the content of the file stored there. The uploader might
place a link to the web page in an Internet forum or circulate it in
another way. Registration is not required of casual users but the
business model of these providers of one-click hosting evidently
includes enticing users to take out a subscription which allows
downloading files without the waiting time imposed on casual users of
the free service. Subscribers also enjoy faster download speeds. File
hosting offers a far greater degree of anonymity than P2P distribution
(Blond et al., 2010a,b), inter alia since both publishers and peers are
exposed for only as long as it takes to transfer the file from/to the
hosting site. It is quite clear that one-click hosting has created a
revenue model for content sharing, whether legal or possibly illegal,
and Antoniades et al. (2009) observe that of a list of 100 unpopular
film titles, more are available on RapidShare than on BitTorrent.

Title availability 

In this section, we consider the characteristics 
of an ideal and fairly complete model of title avail- 
ability. First, consider the salient features of the 
networked digital multi-media world. 



Hollywood universality Antoniades et al. (2009)
found 100% availability on PirateBay for the
top 50 US DVD rental titles for the week
under investigation but only 76% availabil-
ity for Amazon's top 25 German films of all
time. Anecdotal evidence suggests that pop-
ular US television series become available
within a few hours of airing and are nearly
universally available because of the ease of
digital home recording.



Ease of copy The cost of copying a computer file
is nearly zero and copying does not degrade the
original. Hence, even compared to relatively
inexpensive media such as CD or DVD discs, the
marginal cost of reproduction is exceptionally
low for computer copies. Further, digital rights
management (DRM) is not particularly efficient⁹
and there is no limit to the number of copies
of a particular item a consumer can make.
The more consumers source their entertainment
from unencrypted digital sources, the smaller
the impact of DRM will become.

Storage efficiency Digital media can be quite ef- 
ficiently stored. A single portable computer 
hard disk can store hundreds or thousands of 
films in a space no greater than that of a sin- 
gle DVD box. 



Open borders There are no customs inspections
on the Internet and with relatively inexpen-
sive bandwidth in many places, the cost of
transmitting digital content is low. Stor-
age capacity of offline media has not in-
creased sufficiently quickly in order to make
the physical transport of media files an effi-
cient proposition.

8 http://rapidshare.com/files/16433818/ for example.

9 Not least of all because of the analog hole.

In view of the German film example, we suppose only
that almost all English-language content produced
for US television or cinema audiences is available
in digital format on some computer which is
connected to the Internet. Such material
would, for many, constitute a perfectly acceptable 
source of entertainment, perhaps to the extent that 
they would need no other. It is not quite the case 
yet but one might also assume that each poten- 
tial consumer has sufficient storage space available 
to archive a full copy of this corpus. The corpus 
grows only slowly and perhaps slowly enough that 
a typical residential user in the US could download 
the entire accrual each day. 

Assume, for argument's sake, that Hollywood is somehow wiped out and
that the corpus stops growing. In this case, it would be of benefit to
each user to download the entire corpus and this would be possible, in
principle, using P2P if each user is only prepared to contribute
approximately as much as they download. This would be a real socialist
Utopia, but for the problem of enforcing cooperation and punishing
free-riding. Since even BT is subject to opportunistic strategic
manipulation (Levin et al., 2008), the availability of titles is still
very much subject to constraints on storage, network use and the
problem of free-riding. However, many tests have shown BT to be an
efficient allocator of resources and P2P could very well be one of the
mechanisms that would assist in a fair allocation of network resources
on the Internet, a network not initially designed for commercial use.

In the discussion above, it has been taken for granted that consumers
of content have a fixed preference in terms of quality. That is, a
specific unit of artistic content (an episode of Desperate Housewives,
for example) corresponds to a specific chunk of data with associated
storage and transmission costs. However, the proliferation of
high-definition display devices means that consumers now increasingly
prefer high-definition content. High-definition video requires file
sizes that are several times those of standard definition content and
anecdotal evidence suggests that the spectrum of content available in
high-definition formats (1080p or 720p) is still far below that of the
old standard television resolution.

Conclusion 

Protocol design creates incentives that determine, among other things,
the range of titles available on a given platform. Online sharing
involves relatively high costs (e.g. the decrypting, copying and
recoding or "ripping" of a commercial DVD into a compressed and
unencrypted digital format) incurred by publishers of material and much
smaller costs for those who simply share files which they have already
downloaded. The world of online file-sharing is not simply a
free-for-all of illegal material but rather an almost organic
self-organising economy which manages to allocate scarce resources and
supply consumer demand. It has even been suggested that digital
downloads often help to increase sales offline (Peitz and Waelbroeck,
2006). The modelling of P2P networking presents a fascinating
opportunity for network economists and applied mathematicians, with
much to be done to formulate a realistic model that properly
incorporates the dynamics of the system. Observing the spread of
high-definition content will provide further data but will also require
a model of consumers' preference for high-quality content.